This document gives an overall description of the declassified satellite imagery datasets that I’ve been playing around with for the last few months. They were downloaded from https://earthexplorer.usgs.gov/.
The dataset consists of 837,088 images taken between 1960 and 1984 by 5 different satellite systems (see Appendix A for more information about the different datasets).
The following plot shows the temporal distribution of photos, broken down by data source.
n_pics_per_year_grouped <- sat %>% mutate(`Year` = as.numeric(substr(`Acquisition Date`,1,4))) %>%
group_by(Year, `Data Source`) %>%
summarise(n_pics = n())
## `summarise()` has grouped output by 'Year'. You can override using the
## `.groups` argument.
n_pics_per_year_grouped %>%
ggplot(aes(x = Year)) +
geom_area(aes(y = n_pics, fill = `Data Source`),
alpha = 0.9) +
scale_fill_manual(values = c("declass1" = "#7eb0d5", "declass2" = "#fd7f6f", "declass3" = "#01A66F")) +
ylab("Number of Pictures")
## Warning: Removed 1 rows containing non-finite values (`stat_align()`).
It’s worth mentioning that the types of images available vary widely both within and across datasets. To take just one dimension of variation, here is the average footprint of photos for each dataset:
sat_sf <- st_as_sf(sat, wkt = "geometry")
sat_sf$area <- st_area(sat_sf)
sat_sf_trimmed <- sat_sf %>%
filter(area != 0)
avgs <- sat_sf_trimmed %>%
st_drop_geometry() %>%
as.data.frame() %>%
group_by(`Data Source`) %>%
summarise(avg_area = mean(area))
# Calculate the average area of the geometries
mean(sat_sf_trimmed$area) # ~3.22, or about 69*69*3.22 = 15330 sq miles
## [1] 3.223478
avgs %>%
ggplot(aes(x = `Data Source`, y = avg_area)) +
geom_col(fill = c("declass1" = "#7eb0d5", "declass2" = "#fd7f6f", "declass3" = "#01A66F")) +
ylab("Average Area (in lat/lng units)")
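Many of the frames, especially those from the earlier programs in declass1, have truly massive footprints. The unit is square degrees of latitude/longitude, which is imprecise because a degree of longitude covers less ground the farther you get from the equator. One degree is roughly 69 miles at most, so one square degree corresponds to at most roughly 69*69 = ~4,800 square miles. The overall average size of an image frame in the dataset is thus about 15,000 square miles, which is best read as an upper-end estimate.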
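To get a feel for how much the flat 69*69 conversion can overstate a footprint, here is a minimal base R sketch that computes the area of a one-degree-by-one-degree cell on a sphere at two latitudes. The function name `cell_area_sq_miles` and the mean Earth radius of 3958.8 miles are assumptions made for illustration; real footprints are not axis-aligned cells, so this is only a scale check.

```r
# Area (square miles) of a lat/long cell on a spherical Earth.
# Assumption: mean Earth radius of 3958.8 miles.
cell_area_sq_miles <- function(lat_south, lat_north, lon_west, lon_east,
                               radius_miles = 3958.8) {
  to_rad <- pi / 180
  # Width of the cell in radians of longitude
  d_lon <- (lon_east - lon_west) * to_rad
  # Difference of sines gives the height of the latitude band on a unit sphere
  band <- sin(lat_north * to_rad) - sin(lat_south * to_rad)
  radius_miles^2 * d_lon * band
}

area_equator <- cell_area_sq_miles(0, 1, 0, 1)
area_lat_60 <- cell_area_sq_miles(60, 61, 0, 1)
print(c(equator = area_equator, lat_60 = area_lat_60))
```

At 60 degrees north a one-degree cell covers roughly half the ground it does at the equator, so the flat conversion overstates high-latitude footprints the most. If the geometries were given a geographic CRS (for example with `st_set_crs(sat_sf, 4326)`), `st_area()` should return areas in square metres rather than square degrees.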
What we really care about is when these satellite images contain nuclear targets, such as facilities. By cross-referencing with the coordinates from the dataset of facilities that Quido collated, we can get a rough count of which facilities have been photographed (acknowledging that many of these “capture occurrences” may be false positives). See Appendix B for more on how captures were counted.
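Appendix B describes the actual counting procedure. Purely as an illustration of what a footprint test involves, here is a minimal base R sketch of a ray-casting point-in-polygon check. The function `point_in_polygon`, the footprint vertices, and the two test points are all hypothetical; with sf, which this document already uses, a function such as `st_intersects()` would be the usual tool.

```r
# Ray casting: count how many polygon edges a horizontal ray from the point
# crosses; an odd count means the point is inside.
# vx, vy are the polygon's vertex longitudes and latitudes, in order.
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    # Does the edge from vertex j to vertex i straddle the point's latitude?
    crosses <- (vy[i] > py) != (vy[j] > py)
    if (crosses) {
      # Longitude at which the edge passes through the point's latitude
      x_at_py <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      if (px < x_at_py) inside <- !inside
    }
    j <- i
  }
  inside
}

# Hypothetical four-corner footprint (longitude, latitude)
fx <- c(30, 34, 35, 31)
fy <- c(50, 50.5, 53, 52.5)

hit <- point_in_polygon(32.5, 51.5, fx, fy)  # facility within the outline
miss <- point_in_polygon(40, 51, fx, fy)     # facility east of the outline
print(c(hit = hit, miss = miss))
```

A hit only says that the facility's coordinates fall within the frame's outline, which is one reason to treat the resulting counts as rough.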
Here is the facility coverage chart, which you’ve already seen.
#### Calculate and Plot Facility Coverage
# Build coverage DF: one row per year with extant and spotted facility counts
get_coverage <- function(facs, fac_caps, counting_unknown) {
  coverage <- data.frame(year = integer(),
                         num_extant_facilities = integer(),
                         num_spotted_facilities = integer(),
                         coverage = double())
  spotted <- c()
  # get date range (taken from the known start dates only)
  min_date <- min(facs$start_date, na.rm = TRUE)
  max_date <- max(facs$start_date, na.rm = TRUE)
  # replace NA with 1900 if we want to include NAs in analysis;
  # replace() keeps the column's date class, which ifelse() would strip
  if (counting_unknown) {
    facs <- facs %>%
      mutate(start_date = replace(start_date, is.na(start_date), as.Date("1900-01-01")))
  }
  # loop thru years
  for (year in as.integer(format(min_date, "%Y")):as.integer(format(max_date, "%Y"))) {
    year_date <- as.Date(paste(year, "-01-01", sep = ""))
    # extant facilities: started on or before January 1 of this year
    fac_exist <- facs %>%
      filter(start_date <= year_date) %>%
      distinct(facility_name) # Get unique facility names
    n_fac_exist <- nrow(fac_exist)
    # spotted facilities: captured on or before January 1 of this year
    fac_spotted <- fac_caps %>%
      filter(`Acquisition Date` <= year_date) %>%
      distinct(facility_name) # Get unique facility names
    spotted <- unique(append(spotted, fac_spotted$facility_name))
    n_fac_spotted <- length(spotted)
    # remove earlier captures for next loop
    fac_caps <- fac_caps %>%
      filter(`Acquisition Date` > year_date)
    coverage <- bind_rows(coverage, data.frame(year = year,
                                               num_extant_facilities = n_fac_exist,
                                               num_spotted_facilities = n_fac_spotted,
                                               coverage = n_fac_spotted / n_fac_exist))
  }
  return(coverage)
}
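To make the year-by-year logic easier to check by hand, here is a minimal base R sketch of the same cumulative count on toy data. It is not the function above: `coverage_by_year`, the three facilities, and their dates are hypothetical, and it assumes one start date and one first-capture date per facility. Facilities with an unknown start date are ignored, as in the `counting_unknown = FALSE` case.

```r
# Toy inputs (hypothetical): facilities A, B, C in that order.
# C has no known start date and was never captured.
fac_start <- as.Date(c("1950-06-01", "1965-03-01", NA))
first_capture <- as.Date(c("1961-05-01", "1966-07-01", NA))

coverage_by_year <- function(fac_start, first_capture, years) {
  rows <- lapply(years, function(year) {
    cutoff <- as.Date(paste0(year, "-01-01"))
    # A facility counts once its date is on or before January 1 of the year
    n_extant <- sum(fac_start <= cutoff, na.rm = TRUE)
    n_spotted <- sum(first_capture <= cutoff, na.rm = TRUE)
    data.frame(year = year,
               num_extant_facilities = n_extant,
               num_spotted_facilities = n_spotted,
               coverage = n_spotted / n_extant)
  })
  do.call(rbind, rows)
}

cov_toy <- coverage_by_year(fac_start, first_capture, 1960:1967)
print(cov_toy)
```

Because the cutoff is January 1, a capture made in mid-1961 first shows up in the 1962 row, which is worth keeping in mind when reading the charts.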
This version ignores facilities with unknown start dates:
coverage <- get_coverage(facs, fac_caps_no_unknown, counting_unknown=FALSE)
ggplot(coverage, aes(x = year)) +
geom_ribbon(aes(ymin = 0, ymax = num_extant_facilities, fill = "Extant Facilities"), alpha = 0.75) +
geom_ribbon(aes(ymin = 0, ymax = num_spotted_facilities, fill = "Spotted Facilities"), alpha = 0.75) +
scale_fill_manual(values = c("Extant Facilities" = "#7eb0d5", "Spotted Facilities" = "#fd7f6f")) +
labs(y = "Number of Facilities",
x = "Year",
fill = "") +
theme_minimal()
And this version assumes all facilities with unknown start dates were built in 1900:
coverage <- get_coverage(facs, fac_caps_with_unknown, counting_unknown=TRUE)
ggplot(coverage, aes(x = year)) +
geom_ribbon(aes(ymin = 0, ymax = num_extant_facilities, fill = "Extant Facilities"), alpha = 0.75) +
geom_ribbon(aes(ymin = 0, ymax = num_spotted_facilities, fill = "Spotted Facilities"), alpha = 0.75) +
scale_fill_manual(values = c("Extant Facilities" = "#7eb0d5", "Spotted Facilities" = "#fd7f6f")) +
labs(y = "Number of Facilities",
x = "Year",
fill = "") +
theme_minimal()
Captures per year, by data source:
fac_caps <- fac_caps_with_unknown %>%
mutate(`Abbreviated Mission` = substr(Mission,1,4))
caps_by_year_source <- fac_caps %>%
group_by(facility_name, `Acquisition Date`, `Abbreviated Mission`, start_date, `Data Source`) %>%
summarise(n_caps = n()) %>%
mutate(`Year` = as.numeric(substr(`Acquisition Date`,1,4))) %>%
group_by(Year, `Data Source`) %>%
summarise(n_caps = sum(n_caps)) %>%
ungroup()
## `summarise()` has grouped output by 'facility_name', 'Acquisition Date',
## 'Abbreviated Mission', 'start_date'. You can override using the `.groups`
## argument.
## `summarise()` has grouped output by 'Year'. You can override using the
## `.groups` argument.
caps_by_year_source %>%
ggplot(aes(x = Year)) +
geom_col(aes(y = n_caps, fill = `Data Source`),
alpha = 0.9) +
scale_fill_manual(values = c("declass1" = "#7eb0d5", "declass2" = "#fd7f6f", "declass3" = "#01A66F"))
Captures per year, by camera resolution:
caps_by_cam <- fac_caps %>%
group_by(facility_name, `Acquisition Date`, `Abbreviated Mission`, start_date, `Camera Resolution`) %>%
summarise(n_caps = n()) %>%
mutate(`Year` = as.numeric(substr(`Acquisition Date`,1,4))) %>%
group_by(Year, `Camera Resolution`) %>%
summarise(n_caps = sum(n_caps)) %>%
ungroup()
## `summarise()` has grouped output by 'facility_name', 'Acquisition Date',
## 'Abbreviated Mission', 'start_date'. You can override using the `.groups`
## argument.
## `summarise()` has grouped output by 'Year'. You can override using the
## `.groups` argument.
caps_by_cam %>%
ggplot(aes(x = Year, y = n_caps, fill = `Camera Resolution`)) +
geom_col()
The following table counts the captures of each facility, sorted into a visitation leaderboard:
options(knitr.max.print = 200)
fac_caps <- fac_caps_with_unknown %>%
mutate(`Abbreviated Mission` = substr(Mission,1,4))
capture_counts <- fac_caps %>%
group_by(facility_name, `Acquisition Date`, `Abbreviated Mission`, start_date) %>%
summarise(n_caps = n())
## `summarise()` has grouped output by 'facility_name', 'Acquisition Date',
## 'Abbreviated Mission'. You can override using the `.groups` argument.
totals <- capture_counts %>%
ungroup() %>%
group_by(facility_name) %>%
summarise(n_caps = sum(n_caps))
missing <- setdiff(unique(facs$facility_name), unique(capture_counts$facility_name))
# Pad facilities that were never captured with a count of zero
missing_df <- data.frame(facility_name = missing, n_caps = numeric(length(missing)))
totals <- rbind(totals, missing_df)
totals <- totals %>%
arrange(desc(n_caps))
print(totals, n = nrow(totals))
## # A tibble: 157 × 2
## facility_name n_caps
## <chr> <dbl>
## 1 Laboratory No. 2, Moscow 626
## 2 Radium Institute 562
## 3 Leningrad Compressor Plant 559
## 4 Commercial Centrifuge Plant 1, Urals Electro-Chemical Combine, Novou… 478
## 5 Commercial Centrifuge Plant 2, Urals Electro-Chemical Combine, Novou… 478
## 6 Commercial Centrifuge Plant 3, Urals Electro-Chemical Combine, Novou… 478
## 7 Commercial Centrifuge Plant 4, Urals Electro-Chemical Combine, Novou… 478
## 8 D1, Urals Electro-Chemical Combine, Novouralsk 478
## 9 D3, Urals Electro-Chemical Combine, Novouralsk 478
## 10 D4, Urals Electro-Chemical Combine, Novouralsk 478
## 11 D5, Urals Electro-Chemical Combine, Novouralsk 478
## 12 Pilot Centrifuge Plant, Urals Electro-Chemical Combine, Novouralsk 478
## 13 SU-20, Plant 418 473
## 14 Gorky Machine Building Plant 466
## 15 Siberian Chemical Combine (Seversk) II 445
## 16 Siberian Chemical Combine - Diffusion (Seversk formerly Tomsk-7) 440
## 17 Sverdlovsk Laboratory of Electric Phenomena 431
## 18 Krasnoyarsk-45 Electronchemical Plant - Centrifuge (Zelenogorsk in L… 408
## 19 Krasnoyarsk-45 Electronchemical Plant - Diffusion (Zelenogorsk in La… 408
## 20 B Plant, Mayak (Chelyabinsk-65) 401
## 21 Krasnoyarsk-26 398
## 22 S-2, Arzamas-16 388
## 23 RIAR (Research Institute of Atomic Reactors) 371
## 24 Dneproetrovsk Physicochemical Institute 348
## 25 Ukrainian Physicotechnical Institute in Khar’kov 324
## 26 Nuclear Research Institute of Czechoslovakia, Rez 308
## 27 Juiquan Atomic Energy Complex (Plant 404) 248
## 28 Angarsk ElectroChemical Combine, Centrifuge 237
## 29 Angarsk ElectroChemical Combine, Diffusion 237
## 30 Enrichment Technology Company Ltd. Zweigniederlassung Deutschland 235
## 31 China Institute of Atomic Energy (Diffusion Lab) 233
## 32 China Institute of Atomic Energy (Radiochemistry Research Institute) 233
## 33 Lanzhou 1 (China Institute of Atomic Energy Tuoli, Plant 504) 215
## 34 Lanzhou 3 (Indigenous Centrifuge Plant II) 214
## 35 Heping (located in Jinkouhe, Sichuan Province, Plant 814) 202
## 36 Studsvik Research Center 195
## 37 Stockholm Extraction Laboratory 193
## 38 Isotope Production Laboratory 186
## 39 Radio Chemical Laboratory (Yongbyon) 184
## 40 Vinca Electromagnetic Isotope Separator (Vinca Laboratory of Physica… 184
## 41 Vinca Reprocessing Center near Belgrade (Boris Kidric Institute of N… 184
## 42 Hanzhong, Shaanxi Uranium Enrichment Plant, Hanzhong II 178
## 43 Plant 405 Pilot Centrifuge Plant, Hanzhong 178
## 44 Institute A (near Sukhumi) 173
## 45 Institute G (near Sukhumi) 173
## 46 Al Tuwaitha Chemical Ion Enrichment Facility 169
## 47 Al Tuwaitha Gas Diffusion Facility 169
## 48 Jozef Stefan Institute near Ljubljana 162
## 49 Juiquan 2 (Atomic Energy Complex) 160
## 50 Rudjer Boskovic Institute 160
## 51 Kjeller Pilot Uranium Reprocessing Plant 154
## 52 Plutonium Laboratory at Kjeller 154
## 53 Nahal Soreq 137
## 54 Negev Nuclear Research Center, Dimona Machon 2 132
## 55 Negev Nuclear Research Center, Dimona Machon 8 132
## 56 Al Hashan Enrichment Facility 129
## 57 Plutonium Separation Facility at Tajura Nuclear Research Center 124
## 58 Tajoura Enrichment Facility 124
## 59 Nuclear Fuel Component Plant (Pant 812, Yibin Sichuan) 121
## 60 Mol, Purex Reprocessing Facility, Eurochemic 117
## 61 Le Bouchet -- Lab-Scale Reprocessing Plant 115
## 62 Dounreay Reprocessing Facility 112
## 63 Karlsruhe Nuclear Research Center, Institute for Nuclear Process Eng… 112
## 64 NDA Reprocessing Plant MTR 106
## 65 MOX Demonstration Facility 104
## 66 NDA B205 Magnox Reprocessing 104
## 67 NDA B205 Plutonium Operating Corridors 104
## 68 FONTENAY: BÂTIMENT PLUTONIUM (Building 19) 103
## 69 FONTENAY: BÂTIMENT RADIOCHIMIE (Building 18) 103
## 70 Plutonium Chemistry Laboratory (LCPu) -- Fontenay-aux-Roses 103
## 71 B204 Reprocessing Plant at Sellafield 102
## 72 NDA B203 Pu Residues Recovery Plant at Sellafield 102
## 73 Fontenay Pilot Reprocessing Plant 101
## 74 Pierrelate GDP 101
## 75 Plant 821 (Plutonium Production Complex in Guangyuan, Sichuan) 100
## 76 Capenhurst (E-23) 99
## 77 Capenhurst (GD) (E-22) 98
## 78 Capenhurst (Urenco) 98
## 79 Plasma Physics Laboratories in Tehran (TNRC) 90
## 80 Atelier Pilote 85
## 81 Juan Vigon Pilot Reprocessing Plant 81
## 82 Siberian Chemical Combine - Centrifuge (Seversk formerly Tomsk-7) 80
## 83 Areva NC La Hague – UP2 – 400 (renamed HAO facility in 1976) 77
## 84 Chemical Enrichment (KAERI) 77
## 85 Hot Cell Facility, (KAERI Facility) 77
## 86 KAERI (Laboratory for Quantum Optics at Korea Atomic Energy Research… 77
## 87 La Hague (Marcoule-UP1) 77
## 88 La Hague – AT1 77
## 89 RT–1, Combined Mayak in Ozersk/Chelyabinsk-65 73
## 90 Valindaba Z – Plant 72
## 91 ITREC at Trisaia 67
## 92 Laboratory Enrichment Facility at Pelindaba 67
## 93 Montreal Lab 57
## 94 Experimental Reprocessing Plant at Pakistan Institute of Nuclear Sci… 55
## 95 Chalk River Site 54
## 96 Bhabha Atomic Research Center (at Trombay) 48
## 97 Valindaba Y – Plant 48
## 98 Tehran Nuclear Research Center (Reprocessing) 46
## 99 Kalpakkam Reprocessing Plant (KARP) Laboratory 43
## 100 ATTILA, (FONTENAY: ATTaque d’Irradiés-combustibles-en Lits d’Alumine) 42
## 101 Eurex SFRE (MTR) at Saluggia in Vercelli 42
## 102 Chashma Reprocessing Facility 41
## 103 Tokai Test Facility 41
## 104 IPEN – Reprocessing 39
## 105 Ezeiza – SF Reprocessing Facility 32
## 106 Kahuta- KRL (A.Q. Khan Research Laboratories) 30
## 107 Chaklala 29
## 108 Institute for Nuclear Energy Reaction (INER) Reprocessing Facility I 29
## 109 Power Reactor Fuel Reprocessing (PREFRE), Bhabha Atomic Research Cen… 29
## 110 MILLI Reprocessing Test Facility 27
## 111 Institute for Nuclear Energy Reaction (INER) Reprocessing Facility II 26
## 112 PP35 Pierrelatte 26
## 113 Almelo SP1 (Dutch) 24
## 114 Almelo SP2 (German Plant adjacent to the Dutch one in Almelo) 24
## 115 Sihala 24
## 116 Karlsruhe Reprocessing Plant (WAK) 23
## 117 Almelo SP3 Demonstration 21
## 118 PL81 Grenoble 19
## 119 Pilcaniyeu Enrichment Facility I 19
## 120 Eurodif (Georges Besse I) 18
## 121 Ezeiza II – SF Reprocessing Facility 18
## 122 Hot Cell Facility at Inshas Nuclear Research Center 17
## 123 Plutonium Test Extraction Facility (Reprocessing Plant Karlsruhe) 16
## 124 Al Tuwaitha Laser 14
## 125 New Labs at PINSTECH 13
## 126 BRN Enrichment (Aramar Isotopic Enrichment Lab) Ipero, Sao Paulo 12
## 127 Eurex SFRE (Oxide) at Saluggia in Vercelli 12
## 128 Al Tarmiya (north of Baghdad) 11
## 129 Al Tuwaitha Hot Cell 10
## 130 Lucas Heights 10
## 131 Pilot Enrichment Plant- Belo Horizonte (INB Resende) 10
## 132 Aerospace Technical Center (Institute of Advanced Studies) 9
## 133 Pitesti Nuclear Research Institute 9
## 134 RT – 2 9
## 135 Almelo SP4 8
## 136 Asahi Uranium Enrichment Laboratory 8
## 137 JAEA Tokai Reprocessing Plant (Japan Nuclear Cycle Institute) 8
## 138 Reprocessing Test Facility (JRTF) 7
## 139 Silex Laser Enrichment Facility at Lucas Heights Science and Technol… 6
## 140 Urenco Germany GmbH, Gronau 6
## 141 BARC, Trombay (Pilot) 5
## 142 Capenhurst (E-21) 4
## 143 Negev Nuclear Research Center, Dimona Machon 9 4
## 144 Center for Advanced Technology, Laser Enrichment Plant 3
## 145 Eurex SFRE (PU Nitrate Line) Saluggia in Vercelli 2
## 146 Ningyo – Toge Uranium Pilot Plant 2
## 147 Valindaba (Laser) 2
## 148 Hot Cell Complex, Pelindaba Nuclear Research Center 1
## 149 INB Resende –Pilot Enrichment Facility, Rio De Janeiro 0
## 150 Areva NC La Hague – UP2 – 800 0
## 151 Areva NC La Hague – UP3 0
## 152 PL4 0
## 153 Pierrelatte, Laser 0
## 154 Wackersdorf Reprocessing Plant 0
## 155 JAEA Ningyo – Toge Enrichment Demo. Plant (DOP) 0
## 156 Capenhurst A-3 0
## 157 NDA Thorp 0
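The zero rows at the bottom of the leaderboard come from padding the totals with facilities that never appear in the capture table. Here is a minimal base R sketch of that padding step, sized from the data rather than from a fixed count; the site names and counts are hypothetical.

```r
# Hypothetical capture totals and full facility list
totals_toy <- data.frame(facility_name = c("Site A", "Site B"),
                         n_caps = c(12, 3))
all_facilities <- c("Site A", "Site B", "Site C", "Site D")

# Facilities with no captures at all
missing_toy <- setdiff(all_facilities, totals_toy$facility_name)

# One zero per missing facility, however many there are
padding <- data.frame(facility_name = missing_toy,
                      n_caps = numeric(length(missing_toy)))

totals_toy <- rbind(totals_toy, padding)
# Sort from most to fewest captures
totals_toy <- totals_toy[order(-totals_toy$n_caps), ]
print(totals_toy)
```

Sizing the zero vector with `length()` keeps the padding correct if the facility list or the capture table changes.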
And here’s a bar chart of capture counts, with facilities sorted from fewest to most captures, to give a sense of the distribution:
totals %>%
mutate(facility_name = reorder(facility_name,n_caps)) %>%
ggplot(aes(x = facility_name, y = n_caps)) +
geom_col() +
labs(x = NULL, y = "Number of Captures") +
theme(axis.text.x = element_blank())
Declassified 1 is the product of a blanket declassification in 1995 and purportedly represents all of the images from the following satellite programs:
CORONA: 1960-1972
ARGON: 1962-1964
LANYARD: 1963
It’s not clear which images come from which satellite systems.
The dataset contains 837,088 images. A handful of different camera setups were used during the program:
unique(sat1$`Camera Resolution`)
## [1] "Vertical Medium" "Stereo Medium" "Vertical Low" "Vertical High"
## [5] "Stereo High"
unique(sat1$`Camera Type`)
## [1] "Vertical" "Aft" "Forward" "Cartographic"
It’s not clear how the resolution of these cameras compares to the later generations, KH-7 and KH-9.
Here is the total “footprint” of the images in this dataset (shapefile supplied by USGS):
Declassified 2 is the product of a 2002 declassification that released a non-comprehensive selection of imagery from the following programs:
KH-7 (GAMBIT): images taken between 1963 and 1967, the full lifespan of GAMBIT.
KH-9 (HEXAGON): images taken from 1973 to 1980, a subset of the operational period of HEXAGON.
It’s not clear whether all of the images from KH-7 were declassified or whether some were withheld. Only a subset of the KH-9 images were declassified.
The dataset contains 46,699 images.
KH-7 was used for higher-resolution surveillance. KH-9 had both a lower-resolution mapping camera and a higher-resolution surveillance camera, but only the mapping images were declassified in this declassification act:
unique(sat2$`Camera Resolution`)
## [1] "2 to 4 feet" "20 to 30 feet"
unique(sat2$`Camera Type`)
## [1] "KH-7 High Resolution Surveillance" "KH-9 Lower Resolution Mapping"
Here is the total “footprint” of the images in this dataset (shapefile supplied by USGS):
Declassified 3 is the product of a 2011 declassification that released a non-comprehensive selection of imagery from KH-9 (HEXAGON), which ran from 1971 to 1984. This includes images from the high-resolution surveillance camera, but the website says that “almost all of the imagery from these cameras were declassified in 2011,” implying that some images remain classified.
The dataset contains 531,321 images. Note that the website says that “The process to ingest and generate browse imagery for Declass-3 is ongoing,” and suggests that the HEXAGON program generated over 670,000 scenes, indicating that the dataset which we have access to is missing a substantial chunk of the images from HEXAGON.
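As a rough worked check on the size of that gap, here is a short R sketch using only the two figures quoted above. Treating “over 670,000” as a lower bound is an assumption, so the share available is an upper bound and the missing count a lower bound.

```r
n_available <- 531321     # images in the Declass-3 dataset (from the text)
n_scenes_floor <- 670000  # "over 670,000" scenes, treated as a lower bound

# Share of the program's scenes available, and how many are missing
share_available <- n_available / n_scenes_floor
n_missing_floor <- n_scenes_floor - n_available

cat(sprintf("At most %.1f%% available; at least %d scenes missing\n",
            100 * share_available, n_missing_floor))
```

On these figures, at most about 79 percent of the HEXAGON scenes are in the dataset, and at least roughly 139,000 are not.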
Both the terrain mapping and surveillance imagery were included in this declassification:
unique(sat3$`Camera Resolution`)
## [1] "2 to 4 feet" "20 to 30 feet"
unique(sat3$`Camera Type`)
## [1] "High Resolution Surveillance Camera - Forward"
## [2] "High Resolution Surveillance Camera - Aft"
## [3] "Lower Resolution Terrain Mapping Camera"
Here is the total “footprint” of the images in this dataset (shapefile supplied by USGS):